

To Recognition

Neural Information Processing Systems

However, most samples are still [...]-Aug for the head or middle classes (see Fig. 2, green). Hence, f(x; θ) models the likelihood P_train(X = x | Y = y) of the training set rather than the posterior P(Y = y | X = x) in the test set. Since P_train(Y = y_i) follows the long-tailed class frequencies while P_test(Y = y_i) = 1/C is uniform, the prior cannot be ignored: the learned θ, W, b will otherwise carry the training prior, which no longer holds at test time, and the bias B_y = log P_train(Y = y) + log C should be compensated for correctness. Furthermore, B_y enables [...] likelihood (Theorem [...]); the B_y-compensated loss is L_B(y_i, f(x; θ)) = [...]. [Table: our reproduced results on ImageNet-LT and iNaturalist, E2E method with ResNet-10 / ResNet-50 / ResNet-50 backbones; CE baseline: ... 35.88.]
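The prior compensation discussed in this feedback — shifting the logits by a per-class bias B_y = log P_train(Y = y) + log C to account for a uniform test prior P_test(y) = 1/C — can be sketched as follows. This is an assumed form reconstructed from the surrounding text; the function name and toy counts are illustrative, not from the paper:

```python
import numpy as np

def compensated_log_softmax(logits, class_counts):
    """Sketch of prior compensation for long-tailed training (assumed form).

    Adds B_y = log P_train(y) + log C to the logits, so that softmax over
    the shifted logits models the (long-tailed) training posterior while
    the un-shifted logits f(x) correspond to a uniform test-time prior.
    """
    C = len(class_counts)
    p_train = class_counts / class_counts.sum()
    B = np.log(p_train) + np.log(C)        # per-class bias B_y
    z = logits + B                         # compensated train-time logits
    return z - np.log(np.sum(np.exp(z)))   # log-softmax over shifted logits

# Toy long-tailed counts over C = 3 classes and equal logits f(x; theta):
counts = np.array([900.0, 90.0, 10.0])
f = np.array([1.0, 1.0, 1.0])
print(np.exp(compensated_log_softmax(f, counts)))
```

With equal logits the compensated softmax recovers exactly the training class frequencies, which is the sanity check one would expect from this bias.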



165a59f7cf3b5c4396ba65953d679f17-AuthorFeedback.pdf

Neural Information Processing Systems

Thank you for your comments. "What's the difference between Deeploss-VGG/-Squeeze and the loss proposed in [29] (LPIPS)?" We wanted a consistent naming scheme in the paper, but we see that this can be confusing; we will consider renaming them to LPIPS-VGG and LPIPS-Squeeze. "Quantitative analysis on other score functions": we will provide such a measure in the updated paper.


Review for NeurIPS paper: Attribute Prototype Network for Zero-Shot Learning

Neural Information Processing Systems

Weaknesses: Novelty: 1. The proposed model mainly builds on previous ideas: [8] for learning prototypes, [15] for decorrelation and sharing, [52] for localization compactness, and [7] for score calibration. This renders the technical novelty somewhat limited. Nonetheless, I find the employment of these ideas together for attribute localization and ZSL quite interesting, and it seems to lead to consistently good performance. Model: 2. It seems that the model uses continuous attributes. This type of attribute is usually obtained by averaging the image-level binary attributes for each class, which are expensive to obtain.
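The averaging the reviewer describes (class-level continuous attributes from image-level binary annotations) can be illustrated with a minimal sketch; the arrays and shapes here are hypothetical, not from the paper:

```python
import numpy as np

# Hypothetical toy data: 6 images, 4 binary attributes, 2 classes.
binary_attrs = np.array([
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 0],   # images of class 0
    [0, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 1, 0, 1],   # images of class 1
])
labels = np.array([0, 0, 0, 1, 1, 1])

# Continuous class-level attribute = mean of binary image-level annotations.
num_classes = labels.max() + 1
class_attrs = np.stack([binary_attrs[labels == c].mean(axis=0)
                        for c in range(num_classes)])
print(class_attrs)
```

Each class vector entry is then the fraction of that class's images annotated with the attribute, which is why collecting the underlying per-image binary labels is the expensive part.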


Review for NeurIPS paper: Variational Amodal Object Completion

Neural Information Processing Systems

Weaknesses: Some masks do not look reasonable when visual cues are considered, for example the top-left mask of Figure 7, which might nevertheless make sense without the image shown. The paper says "due to the nature of our Amodal-VAE, we discard RGB pixels...", but I wonder whether the VAE could also condition on the instance appearance somehow, and whether that would help. The choice to discard RGB didn't surprise me much, as RGB makes the training input more "noisy" and training more prone to overfitting (to some RGB features). Humans can leverage RGB in a sort of reasoning way: when a mask admits two explanations, use RGB to match the two hypotheses via some mental simulation and then decide. This may be too hard for neural networks trained for a single task.


Review for NeurIPS paper: Look-ahead Meta Learning for Continual Learning

Neural Information Processing Systems

Weaknesses: I found some issues with the experiments, which I list in the following. Line 215 states that the experiments refer to "task incremental settings". This term has a specific meaning in the CL literature [3, 4]: it usually means "multi-head", i.e., task labels are given at inference time. I understand that this is the setting featured in Section 5.2. Recent literature [1, 2, 5, 6] argues that this setting is trivial and that the Single-head/Class-Incremental setting (i.e., task labels not given at inference time) is more meaningful. Providing Class-IL results could therefore be of great help to understand how LA-MAML performs in a more challenging setting.
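The multi-head vs. single-head distinction the review draws can be made concrete: under Task-IL the prediction is restricted to the given task's classes, while under Class-IL the argmax runs over all classes seen so far. A minimal sketch (illustrative only, not LA-MAML's code; the task/class layout is hypothetical):

```python
import numpy as np

# Hypothetical setup: 5 tasks, 2 classes each, logits for one test sample.
rng = np.random.default_rng(0)
num_tasks, classes_per_task = 5, 2
logits = rng.normal(size=num_tasks * classes_per_task)

# Task-IL ("multi-head"): the task label is given at inference, so the
# prediction is restricted to that task's class range (classes 2-3 here).
task_id = 1
lo = task_id * classes_per_task
task_il_pred = lo + int(np.argmax(logits[lo:lo + classes_per_task]))

# Class-IL ("single-head"): no task label; argmax over all classes.
class_il_pred = int(np.argmax(logits))

print(task_il_pred, class_il_pred)
```

Task-IL can never confuse classes across tasks, which is why the cited works consider it the easier protocol.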


SimROD: A Simple Adaptation Method for Robust Object Detection

Ramamonjison, Rindra, Banitalebi-Dehkordi, Amin, Kang, Xinyu, Bai, Xiaolong, Zhang, Yong

arXiv.org Artificial Intelligence

This paper presents a Simple and effective unsupervised adaptation method for Robust Object Detection (SimROD). To overcome the challenges of domain shift and pseudo-label noise, our method integrates a novel domain-centric augmentation method, a gradual self-labeling adaptation procedure, and a teacher-guided fine-tuning mechanism. Using our method, target-domain samples can be leveraged to adapt object detection models without changing the model architecture or generating synthetic data. Our method outperforms prior baselines on multiple domain adaptation benchmarks, covering both image corruptions and high-level cross-domain shifts, and achieves a new state of the art on standard real-to-synthetic and cross-camera benchmarks. On the image corruption benchmark, models adapted with our method achieve a relative robustness improvement of 15-25% AP50 on Pascal-C and 5-6% AP on COCO-C and Cityscapes-C. On the cross-domain benchmark, our method outperforms the best baseline by up to 8% AP50 on the Comic dataset and up to 4% on the Watercolor dataset.


CVPR 2019 WAD Challenge on Trajectory Prediction and 3D Perception

Zhang, Sibo, Ma, Yuexin, Yang, Ruigang, Li, Xin, Zhu, Yanliang, Qian, Deheng, Yang, Zetong, Zhang, Wenjing, Liu, Yuanpei

arXiv.org Artificial Intelligence

This paper reviews the CVPR 2019 challenge on Autonomous Driving. Baidu's Robotics and Autonomous Driving Lab (RAL) provided a 150-minute labeled trajectory and 3D perception dataset, including about 80k lidar point clouds and 1000 km of trajectories for urban traffic. The challenge has two tasks: (1) Trajectory Prediction and (2) 3D Lidar Object Detection. More than 200 teams submitted results to the leaderboard, and more than 1000 participants attended the workshop.


NVIDIA Researchers Present Pixel Adaptive Convolutional Neural Networks at CVPR 2019 - NVIDIA Developer News Center

#artificialintelligence

Despite the widespread use of convolutional neural networks (CNNs), the convolution operations used in standard CNNs have some limitations. To overcome these limitations, researchers from NVIDIA and the University of Massachusetts Amherst developed a new type of convolution operation that can dynamically adapt to input images and generate filters specific to the content. The researchers will present their work at the annual Computer Vision and Pattern Recognition (CVPR) conference in Long Beach, California this week. "Convolutions are the fundamental building blocks of CNNs," the researchers wrote in the paper. "The fact that their weights are spatially shared is one of the main reasons for their widespread use, but it is also a major limitation, as it makes convolutions content-agnostic." To mitigate this limitation, the team proposed a generalization of the convolution operation, Pixel-Adaptive Convolution (PAC).
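The core idea described above is to keep a spatially shared filter but modulate it per pixel with a content-adaptive kernel computed from guidance features. A minimal NumPy sketch under assumed details (Gaussian kernel on the features, single channel, "same" padding; the function name and signature are illustrative, not the authors' implementation):

```python
import numpy as np

def pac_conv2d(x, feats, W, sigma=1.0):
    """Pixel-adaptive convolution sketch (single channel, zero padding).

    x:     (H, W) input map
    feats: (H, W, D) per-pixel guidance features f_i
    W:     (k, k) spatially shared filter, k odd
    Each shared weight is scaled by a fixed Gaussian kernel on the features,
    K(f_i, f_j) = exp(-||f_i - f_j||^2 / (2 sigma^2)), before the weighted sum.
    """
    k = W.shape[0]
    r = k // 2
    H, Wd = x.shape
    xp = np.pad(x, r)
    fp = np.pad(feats, ((r, r), (r, r), (0, 0)))
    out = np.zeros_like(x, dtype=float)
    for i in range(H):
        for j in range(Wd):
            patch = xp[i:i + k, j:j + k]                 # input window
            fdiff = fp[i:i + k, j:j + k] - feats[i, j]   # f_j - f_i
            K = np.exp(-np.sum(fdiff ** 2, axis=-1) / (2 * sigma ** 2))
            out[i, j] = np.sum(K * W * patch)            # adapted weighted sum
    return out
```

When the guidance features are constant, K is 1 everywhere and PAC reduces to a standard (content-agnostic) convolution, which is the sense in which it generalizes the usual operation.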


Triple 'Strong Accept' for CVPR 2019: Reinforced Cross-Modal Matching & Self-Supervised Imitation…

#artificialintelligence

The Conference on Computer Vision and Pattern Recognition (CVPR) is one of the world's top computer vision (CV) conferences. CVPR 2019 runs June 15 through June 21 in Long Beach, California, and the list of accepted papers for the prestigious gathering has now been released. A total of 1300 papers were accepted from a record-high 5165 submissions this year, and one standout already garnering attention is Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation. The paper is said to have received a "Strong Accept" from all three peer reviewers and to rank first, according to University of California, Santa Barbara NLP Group Director William Wang, who is also one of the paper's authors. The paper proposes a new method for vision-language navigation (VLN) tasks that combines the strengths of reinforcement learning and self-supervised imitation learning.